MIDTERM EXAM HANDS ON - CHICAGO CRIMES DATA ANALYTICS

Analyst: Ostaga, Christian Joseph

In [2]:
#import libraries
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
In [3]:
df = pd.read_csv('Datasets/Chicago_Crimes.csv')
In [4]:
df
Out[4]:
ID Case Number Date Block IUCR Primary Type Description Location Description Arrest Domestic ... Ward Community Area FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location
0 13439321 JH237424 04/14/2024 12:00:00 AM 040XX S PRAIRIE AVE 0890 THEFT FROM BUILDING APARTMENT False False ... 3 38.0 06 1178707.0 1878256.0 2024 12/21/2024 03:40:46 PM 41.821236 -87.619921 (41.821236024, -87.619920712)
1 13437420 JH234779 04/14/2024 12:00:00 AM 023XX W CERMAK RD 2825 OTHER OFFENSE HARASSMENT BY TELEPHONE COMMERCIAL / BUSINESS OFFICE False False ... 25 31.0 26 1161210.0 1889347.0 2024 12/21/2024 03:40:46 PM 41.852052 -87.683801 (41.852051675, -87.683800849)
2 13428676 JH224478 04/14/2024 12:00:00 AM 043XX W LE MOYNE ST 0917 MOTOR VEHICLE THEFT CYCLE, SCOOTER, BIKE WITH VIN STREET False False ... 36 23.0 07 1146960.0 1909501.0 2024 12/21/2024 03:40:46 PM 41.907640 -87.735587 (41.907640473, -87.735587478)
3 13429357 JH225293 04/14/2024 12:00:00 AM 039XX W ADAMS ST 143A WEAPONS VIOLATION UNLAWFUL POSSESSION - HANDGUN STREET True False ... 28 26.0 15 1150158.0 1898721.0 2024 12/21/2024 03:40:46 PM 41.877997 -87.724121 (41.877997275, -87.724120826)
4 13430098 JH226395 04/14/2024 12:00:00 AM 011XX W 112TH PL 0890 THEFT FROM BUILDING RESIDENCE False False ... 21 75.0 06 1170856.0 1830157.0 2024 12/21/2024 03:40:46 PM 41.689421 -87.650123 (41.6894214, -87.650123247)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
249118 13805239 JJ217509 04/12/2025 12:00:00 AM 029XX W LOGAN BLVD 2826 OTHER OFFENSE HARASSMENT BY ELECTRONIC MEANS APARTMENT False False ... 1 22.0 26 1156478.0 1917149.0 2025 04/19/2025 03:41:24 PM 41.928440 -87.700416 (41.928439867, -87.700415972)
249119 13804023 JJ215813 04/12/2025 12:00:00 AM 094XX S HARVARD AVE 0430 BATTERY AGGRAVATED - OTHER DANGEROUS WEAPON STREET False False ... 9 49.0 04B 1175694.0 1842631.0 2025 04/19/2025 03:41:24 PM 41.723545 -87.632040 (41.723545182, -87.632039508)
249120 13803926 JJ215943 04/12/2025 12:00:00 AM 084XX S VINCENNES AVE 0486 BATTERY DOMESTIC BATTERY SIMPLE APARTMENT False True ... 21 71.0 08B 1173850.0 1848976.0 2025 04/19/2025 03:41:24 PM 41.740998 -87.638606 (41.74099774, -87.638606337)
249121 13803475 JJ215338 04/12/2025 12:00:00 AM 050XX S ABERDEEN ST 0530 ASSAULT AGGRAVATED - OTHER DANGEROUS WEAPON STREET True False ... 20 61.0 04A 1169838.0 1871348.0 2025 04/19/2025 03:41:24 PM 41.802477 -87.652657 (41.802477219, -87.652657244)
249122 13804512 JJ216668 04/12/2025 12:00:00 AM 012XX W CARROLL AVE 0710 THEFT THEFT FROM MOTOR VEHICLE STREET False False ... 27 28.0 06 1168216.0 1902390.0 2025 04/19/2025 03:41:24 PM 41.887694 -87.657710 (41.887694407, -87.657710204)

249123 rows × 22 columns

In [5]:
df.columns
Out[5]:
Index(['ID', 'Case Number', 'Date', 'Block', 'IUCR', 'Primary Type',
       'Description', 'Location Description', 'Arrest', 'Domestic', 'Beat',
       'District', 'Ward', 'Community Area', 'FBI Code', 'X Coordinate',
       'Y Coordinate', 'Year', 'Updated On', 'Latitude', 'Longitude',
       'Location'],
      dtype='object')

QUESTIONS

1. What are the top 10 most common primary crime types in Chicago?

In [6]:
top_types = df['Primary Type'].value_counts().head(10)
plt.figure(figsize=(10,6))
sns.barplot(x=top_types.values, y=top_types.index, palette='viridis')
plt.title('Top 10 Most Common Primary Crime Types')
plt.xlabel('Number of Crimes')
plt.ylabel('Primary Type')
plt.tight_layout()
plt.show()
[Figure: horizontal bar chart of the top 10 primary crime types by number of crimes]

Insight: Based on the bar plot, theft is the most common primary crime type.
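
The ranking above can also be expressed as shares, which makes it easier to say how dominant the top type is. The sketch below is only an illustration: the toy DataFrame and its values are made up and are not taken from Chicago_Crimes.csv.

```python
import pandas as pd

# Toy stand-in for df['Primary Type']; values are illustrative only.
toy = pd.DataFrame({
    "Primary Type": ["THEFT", "THEFT", "THEFT", "BATTERY", "BATTERY", "ASSAULT"]
})

# Absolute counts and percentage shares, both sorted from most to least common.
counts = toy["Primary Type"].value_counts()
shares = toy["Primary Type"].value_counts(normalize=True).mul(100).round(1)

summary = pd.DataFrame({"count": counts, "share_pct": shares})
print(summary)
```

On the real data, the same two calls on df['Primary Type'] followed by .head(10) would give the counts behind the bar plot together with their percentages.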

2. What are the monthly crime trends in Chicago?

In [7]:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df['YearMonth'] = df['Date'].dt.to_period('M')

monthly_trend = df.groupby('YearMonth').size()

plt.figure(figsize=(14,6))
monthly_trend.plot(kind='line', marker='o', color='teal')
plt.title('Monthly Crime Trend in Chicago')
plt.xlabel('Year-Month')
plt.ylabel('Number of Crimes')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()
[Figure: line chart of the number of crimes per Year-Month]

Insight: Based on the plot, July 2024 has the highest monthly crime count.

Insight: After July the monthly count goes down almost every month, with a small rise in March 2025 before it drops again in April 2025. The author's guess is that measures were taken after July to reduce crime, but the plot alone cannot confirm this. Also, the rows shown in Out[4] run from 04/14/2024 to 04/12/2025, so both April points likely cover only part of a month and look low partly for that reason.
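
Since the first and last months may be incomplete, one option is to keep only the months that are fully covered before reading the trend. This is a minimal sketch on made-up dates; the rule used for "complete" (the month lies entirely inside the observed date range) is an assumption, not something taken from the notebook.

```python
import pandas as pd

# Toy dates: the first month starts mid-month and the last month ends mid-month.
dates = pd.to_datetime([
    "2024-04-14", "2024-04-20",
    "2024-05-01", "2024-05-15", "2024-05-31",
    "2024-06-01", "2024-06-12",
])
toy = pd.DataFrame({"Date": dates})
toy["YearMonth"] = toy["Date"].dt.to_period("M")

monthly = toy.groupby("YearMonth").size()

# Assumed rule: a month counts as complete only if its first and last day
# both fall inside the observed date range.
first_day = toy["Date"].min().normalize()
last_day = toy["Date"].max().normalize()
complete = [
    (p.start_time >= first_day) and (p.end_time.normalize() <= last_day)
    for p in monthly.index
]
monthly_complete = monthly[complete]
print(monthly_complete)
```

Plotting monthly_complete instead of monthly would drop the two partial April points from the trend line.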

3. Which locations have the highest number of reported crimes in Chicago?

In [8]:
top_locations = df['Location Description'].value_counts().head(10)

plt.figure(figsize=(10,6))
sns.barplot(x=top_locations.values, y=top_locations.index, palette='magma')
plt.title('Top 10 Locations with Highest Number of Crimes')
plt.xlabel('Number of Crimes')
plt.ylabel('Location Description')
plt.tight_layout()
plt.show()
[Figure: horizontal bar chart of the top 10 location descriptions by number of crimes]

Insight: Based on the bar plot, the street is the location with the highest number of reported crimes.

4. What is the correlation between arrest rates and domestic crimes?

In [9]:
df['Arrest'] = df['Arrest'].astype(int)
df['Domestic'] = df['Domestic'].astype(int)
corr = df[['Arrest', 'Domestic']].corr().iloc[0,1]

plt.figure(figsize=(4,4))
sns.heatmap(df[['Arrest', 'Domestic']].corr(), annot=True, cmap='coolwarm')
plt.title(f'Correlation: Arrest vs Domestic ({corr:.2f})')
plt.show()
[Figure: 2x2 correlation heatmap of Arrest and Domestic]

Insight: Based on the heatmap, the correlation between Arrest and Domestic is positive but practically negligible. A correlation of 0.0053 suggests almost no linear relationship between the two.
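
A correlation between two 0/1 columns can be hard to interpret on its own, so it may help to also compare the arrest rate inside each group. The sketch below uses made-up values; only the column names Arrest and Domestic come from the notebook.

```python
import pandas as pd

# Toy 0/1 flags; values are illustrative only.
toy = pd.DataFrame({
    "Arrest":   [1, 0, 0, 0, 1, 0, 0, 1],
    "Domestic": [1, 1, 0, 0, 0, 0, 1, 0],
})

# Arrest rate (mean of the 0/1 flag) and group size for each Domestic value.
arrest_by_domestic = toy.groupby("Domestic")["Arrest"].agg(["mean", "size"])

# Pearson correlation of two binary columns (also called the phi coefficient).
phi = toy["Arrest"].corr(toy["Domestic"])

print(arrest_by_domestic)
print(round(phi, 4))
```

If the two arrest rates are almost equal, a correlation close to zero is exactly what should be expected.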

5. What is the average number of crimes per weekday?

In [10]:
df['Weekday'] = df['Date'].dt.day_name()
weekday_counts = df['Weekday'].value_counts().reindex(
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']
)

plt.figure(figsize=(8,5))
sns.barplot(x=weekday_counts.index, y=weekday_counts.values, palette='Set2')
plt.title('Total Number of Crimes per Weekday')  # value_counts gives totals, not averages
plt.ylabel('Number of Crimes')
plt.xlabel('Weekday')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
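[Figure: bar chart of the number of crimes for each weekday, Monday to Sunday]

Insight: Based on the bar plot, every weekday has a very high number of crimes and the gaps between the bars are small, so crime is spread almost evenly across the week. Note that the bars are total counts per weekday over the whole period, not per-day averages.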
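
To answer the question as an average, one possible approach is to count crimes per calendar day first and then average those daily counts by weekday. This is a sketch on made-up dates, not the notebook's method; days with no recorded crime do not appear in the daily counts, so they are not included in the average.

```python
import pandas as pd

# Toy incident dates (hypothetical): two Mondays and one Tuesday.
toy = pd.DataFrame({"Date": pd.to_datetime([
    "2024-04-15", "2024-04-15", "2024-04-15",  # Monday, 3 incidents
    "2024-04-22",                              # Monday, 1 incident
    "2024-04-16", "2024-04-16",                # Tuesday, 2 incidents
])})

# Step 1: number of crimes on each calendar day.
daily = toy.groupby(toy["Date"].dt.normalize()).size()

# Step 2: average of those daily counts for each weekday name.
avg_per_weekday = daily.groupby(daily.index.day_name()).mean()
print(avg_per_weekday)
```

The result can be reindexed to Monday through Sunday in the same way as weekday_counts before plotting.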

6. What is the maximum number of crimes reported in a single day?

In [12]:
df['Date'] = pd.to_datetime(df['Date'])


daily_counts = df.groupby(df['Date'].dt.date).size()
plt.figure(figsize=(12,5))
daily_counts.plot(kind='line')
plt.title('Crimes per Day')
plt.xlabel('Date')
plt.ylabel('Number of Crimes')
plt.tight_layout()
plt.show()
[Figure: line chart of the number of crimes per day]

Insight: Based on the plot, the daily crime numbers were mostly steady, but at the very end the line suddenly drops close to zero. The author's first guess was a big event on that day, but a more likely reason is that the last date in the extract is incomplete. The plot does not directly report the maximum; it can be read from daily_counts.max().
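
The question asks for a single number, which the line plot does not give directly. A minimal sketch of how the busiest day and its count could be read from the daily counts, using made-up timestamps:

```python
import pandas as pd

# Toy timestamps; values are illustrative only.
toy = pd.DataFrame({"Date": pd.to_datetime([
    "2024-07-01 08:00", "2024-07-01 21:30",
    "2024-07-02 10:15",
    "2024-07-03 00:05", "2024-07-03 13:00", "2024-07-03 23:59",
])})

# Count crimes per calendar day, ignoring the time of day.
daily_counts = toy.groupby(toy["Date"].dt.date).size()

busiest_day = daily_counts.idxmax()  # date with the most crimes
max_count = daily_counts.max()       # number of crimes on that date
print(f"{busiest_day}: {max_count}")
```

On the notebook's daily_counts, the same idxmax() and max() calls would answer question 6.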

7. What are the most common crime descriptions?

In [13]:
df.columns
Out[13]:
Index(['ID', 'Case Number', 'Date', 'Block', 'IUCR', 'Primary Type',
       'Description', 'Location Description', 'Arrest', 'Domestic', 'Beat',
       'District', 'Ward', 'Community Area', 'FBI Code', 'X Coordinate',
       'Y Coordinate', 'Year', 'Updated On', 'Latitude', 'Longitude',
       'Location', 'YearMonth', 'Weekday'],
      dtype='object')
In [14]:
df
Out[14]:
ID Case Number Date Block IUCR Primary Type Description Location Description Arrest Domestic ... FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location YearMonth Weekday
0 13439321 JH237424 2024-04-14 040XX S PRAIRIE AVE 0890 THEFT FROM BUILDING APARTMENT 0 0 ... 06 1178707.0 1878256.0 2024 12/21/2024 03:40:46 PM 41.821236 -87.619921 (41.821236024, -87.619920712) 2024-04 Sunday
1 13437420 JH234779 2024-04-14 023XX W CERMAK RD 2825 OTHER OFFENSE HARASSMENT BY TELEPHONE COMMERCIAL / BUSINESS OFFICE 0 0 ... 26 1161210.0 1889347.0 2024 12/21/2024 03:40:46 PM 41.852052 -87.683801 (41.852051675, -87.683800849) 2024-04 Sunday
2 13428676 JH224478 2024-04-14 043XX W LE MOYNE ST 0917 MOTOR VEHICLE THEFT CYCLE, SCOOTER, BIKE WITH VIN STREET 0 0 ... 07 1146960.0 1909501.0 2024 12/21/2024 03:40:46 PM 41.907640 -87.735587 (41.907640473, -87.735587478) 2024-04 Sunday
3 13429357 JH225293 2024-04-14 039XX W ADAMS ST 143A WEAPONS VIOLATION UNLAWFUL POSSESSION - HANDGUN STREET 1 0 ... 15 1150158.0 1898721.0 2024 12/21/2024 03:40:46 PM 41.877997 -87.724121 (41.877997275, -87.724120826) 2024-04 Sunday
4 13430098 JH226395 2024-04-14 011XX W 112TH PL 0890 THEFT FROM BUILDING RESIDENCE 0 0 ... 06 1170856.0 1830157.0 2024 12/21/2024 03:40:46 PM 41.689421 -87.650123 (41.6894214, -87.650123247) 2024-04 Sunday
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
249118 13805239 JJ217509 2025-04-12 029XX W LOGAN BLVD 2826 OTHER OFFENSE HARASSMENT BY ELECTRONIC MEANS APARTMENT 0 0 ... 26 1156478.0 1917149.0 2025 04/19/2025 03:41:24 PM 41.928440 -87.700416 (41.928439867, -87.700415972) 2025-04 Saturday
249119 13804023 JJ215813 2025-04-12 094XX S HARVARD AVE 0430 BATTERY AGGRAVATED - OTHER DANGEROUS WEAPON STREET 0 0 ... 04B 1175694.0 1842631.0 2025 04/19/2025 03:41:24 PM 41.723545 -87.632040 (41.723545182, -87.632039508) 2025-04 Saturday
249120 13803926 JJ215943 2025-04-12 084XX S VINCENNES AVE 0486 BATTERY DOMESTIC BATTERY SIMPLE APARTMENT 0 1 ... 08B 1173850.0 1848976.0 2025 04/19/2025 03:41:24 PM 41.740998 -87.638606 (41.74099774, -87.638606337) 2025-04 Saturday
249121 13803475 JJ215338 2025-04-12 050XX S ABERDEEN ST 0530 ASSAULT AGGRAVATED - OTHER DANGEROUS WEAPON STREET 1 0 ... 04A 1169838.0 1871348.0 2025 04/19/2025 03:41:24 PM 41.802477 -87.652657 (41.802477219, -87.652657244) 2025-04 Saturday
249122 13804512 JJ216668 2025-04-12 012XX W CARROLL AVE 0710 THEFT THEFT FROM MOTOR VEHICLE STREET 0 0 ... 06 1168216.0 1902390.0 2025 04/19/2025 03:41:24 PM 41.887694 -87.657710 (41.887694407, -87.657710204) 2025-04 Saturday

249123 rows × 24 columns

In [19]:
description = df['Description'].value_counts().head(10)

plt.figure(figsize=(8,5))
sns.barplot(x=description.values, y=description.index, palette='Blues_r')
plt.title('Top 10 Most Common Crime Descriptions')  # head(10) above selects ten rows
plt.xlabel('Number of Crimes')
plt.ylabel('Description')
plt.tight_layout()
plt.show()
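[Figure: horizontal bar chart of the most common crime descriptions by number of crimes]

Insight: Based on the bar plot, the most common description is SIMPLE. Other frequent descriptions include DOMESTIC BATTERY SIMPLE and $500 AND UNDER. These descriptions are sub-categories of a primary type (for example, DOMESTIC BATTERY SIMPLE belongs to BATTERY), so SIMPLE on its own is not a crime type and is best read together with the Primary Type column.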

8. What is the trend of arrest rates over the years?

In [21]:
df['Year'] = df['Date'].dt.year
arrest_rate = df.groupby('Year')['Arrest'].mean()

plt.figure(figsize=(10,5))
arrest_rate.plot(marker='o', color='purple')
plt.title('Arrest Rate Trend Over Years')
plt.xlabel('Year')
plt.ylabel('Arrest Rate')
plt.tight_layout()
plt.show()
[Figure: line chart of the arrest rate per year]

Insight: Based on the plot, the arrest rate is low in 2024 and clearly higher in 2025, with no drop in between. The data covers only two calendar years, and both appear to be partial, so the line has just two points and shows a difference between years rather than a month-by-month trend.
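
With only two yearly points, a monthly arrest rate may describe the trend better. The sketch below is an assumption about how that could be done, on made-up rows; it also keeps the number of cases per month so that rates based on very few rows can be spotted.

```python
import pandas as pd

# Toy rows; values are illustrative only.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-05-03", "2024-05-20",
        "2024-06-02", "2024-06-18", "2024-06-25",
        "2024-07-07",
    ]),
    "Arrest": [1, 0, 0, 0, 1, 1],
})

# Arrest rate and number of cases for each month.
monthly_rate = (
    toy.groupby(toy["Date"].dt.to_period("M"))["Arrest"]
       .agg(rate="mean", n="size")
)
print(monthly_rate)
```

Recent months can show a lower rate simply because arrests are recorded later, so the last points of such a series should be read with care.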

9. What is the distribution of crime types for domestic vs non-domestic cases?

10. What is the correlation between domestic and non-domestic crimes?

In [23]:
domestic_types = df[df['Domestic']==1]['Primary Type'].value_counts().head(5)
nondomestic_types = df[df['Domestic']==0]['Primary Type'].value_counts().head(5)

fig, ax = plt.subplots(1,2, figsize=(14,5))
sns.barplot(x=domestic_types.values, y=domestic_types.index, ax=ax[0], palette='Reds')
ax[0].set_title('Top 5 Domestic Crime Types')
ax[0].set_xlabel('Count')
sns.barplot(x=nondomestic_types.values, y=nondomestic_types.index, ax=ax[1], palette='Greens')
ax[1].set_title('Top 5 Non-Domestic Crime Types')
ax[1].set_xlabel('Count')
plt.tight_layout()
plt.show()
[Figure: two horizontal bar charts, top 5 domestic crime types and top 5 non-domestic crime types]

Insight for number 9: Based on the two bar plots, domestic cases are mostly battery and assault, while non-domestic cases are mostly theft and criminal damage. This shows that the mix of crime types differs between domestic and non-domestic cases.

In [25]:
crime_by_type_date = df.groupby(['Date', 'Primary Type']).size().unstack(fill_value=0)

top_domestic = ['BATTERY', 'ASSAULT', 'OTHER OFFENSE', 'CRIMINAL DAMAGE', 'THEFT']
top_nondomestic = ['THEFT', 'BATTERY', 'CRIMINAL DAMAGE', 'MOTOR VEHICLE THEFT', 'ASSAULT']
selected_types = list(set(top_domestic + top_nondomestic))

filtered_crime = crime_by_type_date[selected_types]
correlation_matrix = filtered_crime.corr()
print(correlation_matrix)
Primary Type          BATTERY     THEFT  OTHER OFFENSE   ASSAULT  \
Primary Type                                                       
BATTERY              1.000000  0.230392       0.074937  0.069857   
THEFT                0.230392  1.000000       0.219587  0.188498   
OTHER OFFENSE        0.074937  0.219587       1.000000  0.068052   
ASSAULT              0.069857  0.188498       0.068052  1.000000   
MOTOR VEHICLE THEFT  0.256005  0.426658       0.174815  0.170638   
CRIMINAL DAMAGE      0.193837  0.343249       0.149269  0.126951   

Primary Type         MOTOR VEHICLE THEFT  CRIMINAL DAMAGE  
Primary Type                                               
BATTERY                         0.256005         0.193837  
THEFT                           0.426658         0.343249  
OTHER OFFENSE                   0.174815         0.149269  
ASSAULT                         0.170638         0.126951  
MOTOR VEHICLE THEFT             1.000000         0.352623  
CRIMINAL DAMAGE                 0.352623         1.000000  

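Insight for number 10: Based on the result, every pair of the selected crime types is positively correlated. Some pairs are very weak (about 0.07 for OTHER OFFENSE and ASSAULT) and the strongest is moderate (about 0.43 for THEFT and MOTOR VEHICLE THEFT), but none is zero or negative. Note that this matrix compares crime types that are common in domestic and non-domestic cases; it does not directly correlate domestic counts with non-domestic counts.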
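
A more direct way to answer question 10 could be to count domestic and non-domestic crimes per day and correlate the two daily series. This is a sketch on made-up rows and not the method used above; the column names non_domestic and domestic are introduced here for illustration.

```python
import pandas as pd

# Toy rows; values are illustrative only.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-07-01", "2024-07-01", "2024-07-01",
        "2024-07-02", "2024-07-02",
        "2024-07-03", "2024-07-03", "2024-07-03", "2024-07-03",
    ]),
    "Domestic": [1, 0, 0, 0, 1, 1, 1, 0, 0],
})

# One row per day, one column per Domestic flag, filled with daily counts.
daily_by_flag = (
    toy.groupby([toy["Date"].dt.date, "Domestic"])
       .size()
       .unstack(fill_value=0)
       .rename(columns={0: "non_domestic", 1: "domestic"})
)

# Pearson correlation between the two daily count series.
r = daily_by_flag["domestic"].corr(daily_by_flag["non_domestic"])
print(daily_by_flag)
print(round(r, 2))
```

Grouping by the calendar date (dt.date) rather than the full timestamp keeps all crimes of one day in the same row.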

11. What is the monthly number of crimes in apartments?

In [27]:
apartment_df = df[df['Location Description'] == 'APARTMENT'].copy()  # copy so later assignments do not act on a slice of df
apartment_df['Date'] = pd.to_datetime(apartment_df['Date'])
monthly_apartment = apartment_df.groupby(apartment_df['Date'].dt.to_period('M')).size()

plt.figure(figsize=(10,5))
monthly_apartment.plot(marker='o')
plt.title('Monthly Crimes in Apartments')
plt.xlabel('Month')
plt.ylabel('Number of Crimes')
plt.tight_layout()
plt.show()
[Figure: line chart of the monthly number of crimes in apartments]

Insight: Based on the plot, there are a lot of crimes in apartments every month. The count is low in April 2024, where the series starts, and low again in April 2025, where it ends. The author reads this as a sign that crime in apartments was reduced over the year, but the rows shown start on 2024-04-14 and end on 2025-04-12, so both April points likely cover only part of a month and the low values may simply reflect that.

12. What is the arrest rate at gas stations?

In [29]:
gas_df = df[df['Location Description'] == 'GAS STATION'].copy()  # copy so later assignments do not act on a slice of df
gas_df['Arrest'] = gas_df['Arrest'].astype(int)
arrest_rate_gas = gas_df['Arrest'].mean()

plt.pie([arrest_rate_gas, 1-arrest_rate_gas], labels=['Arrested','Not Arrested'], autopct='%1.1f%%', colors=['#66b3ff','#ff9999'])
plt.title('Arrest Rate at Gas Stations')
plt.show()
[Figure: pie chart of Arrested vs Not Arrested at gas stations]

Insight: Based on the pie chart, the arrest rate at gas stations is only 22.8%, which means most reported crimes at that location do not end in an arrest.

13. Number of Domestic vs Non-Domestic Crimes on Sidewalks.

In [31]:
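sidewalk_df = df[df['Location Description'] == 'SIDEWALK'].copy()  # copy so later assignments do not act on a slice of df
sidewalk_df['Domestic'] = sidewalk_df['Domestic'].astype(int)
# reindex fixes the order to [0, 1] and fills a missing category with 0
domestic_counts = sidewalk_df['Domestic'].value_counts().reindex([0, 1], fill_value=0)

plt.bar(['Non-Domestic','Domestic'], domestic_counts.values, color=['#a1dab4','#41b6c4'])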
plt.title('Domestic vs Non-Domestic Crimes on Sidewalks')
plt.ylabel('Number of Crimes')
plt.show()
[Figure: bar chart of Non-Domestic vs Domestic crimes on sidewalks]

Insight: Based on the bar graph, non-domestic crimes far outnumber domestic crimes on sidewalks. Comparing the two bars, domestic crimes are very rare there.

14. What are the crime types in the Airport Parking Lot?

In [38]:
df['Location Description'].unique()
Out[38]:
array(['APARTMENT', 'COMMERCIAL / BUSINESS OFFICE', 'STREET', 'RESIDENCE',
       'RESIDENCE - PORCH / HALLWAY', 'RESTAURANT',
       'HOSPITAL BUILDING / GROUNDS', 'ATHLETIC CLUB',
       'PARKING LOT / GARAGE (NON RESIDENTIAL)', 'VEHICLE NON-COMMERCIAL',
       'SIDEWALK', 'OTHER (SPECIFY)', 'SCHOOL - PUBLIC BUILDING',
       'DRIVEWAY - RESIDENTIAL', 'BAR OR TAVERN', 'ALLEY', 'DRUG STORE',
       'SMALL RETAIL STORE', 'RESIDENCE - GARAGE', 'PARK PROPERTY',
       'CONVENIENCE STORE', 'HOTEL / MOTEL', 'SCHOOL - PUBLIC GROUNDS',
       'BOAT / WATERCRAFT', 'CHA PARKING LOT / GROUNDS',
       'POLICE FACILITY / VEHICLE PARKING LOT',
       'AIRPORT TERMINAL UPPER LEVEL - NON-SECURE AREA',
       'AIRPORT PARKING LOT', 'AIRPORT EXTERIOR - NON-SECURE AREA',
       'AIRPORT TERMINAL UPPER LEVEL - SECURE AREA', 'BANK',
       'CTA PARKING LOT / GARAGE / OTHER PROPERTY', 'DEPARTMENT STORE',
       'VACANT LOT / LAND', 'CHURCH / SYNAGOGUE / PLACE OF WORSHIP',
       'NURSING / RETIREMENT HOME', 'GAS STATION',
       'RESIDENCE - YARD (FRONT / BACK)', 'GROCERY FOOD STORE',
       'CTA TRAIN', 'AIRPORT TERMINAL MEZZANINE - NON-SECURE AREA',
       'AIRPORT EXTERIOR - SECURE AREA', 'CEMETARY', 'CTA BUS',
       'CTA STATION', 'CTA PLATFORM', 'CHA APARTMENT',
       'AIRPORT TERMINAL LOWER LEVEL - SECURE AREA',
       'TAVERN / LIQUOR STORE', 'CONSTRUCTION SITE', 'WAREHOUSE',
       'CAR WASH', 'SCHOOL - PRIVATE GROUNDS', nan, 'DAY CARE CENTER',
       'AIRPORT TERMINAL LOWER LEVEL - NON-SECURE AREA',
       'FEDERAL BUILDING', 'GOVERNMENT BUILDING / PROPERTY',
       'VEHICLE - COMMERCIAL', 'COLLEGE / UNIVERSITY - GROUNDS',
       'AUTO / BOAT / RV DEALERSHIP', 'CTA BUS STOP', 'LIBRARY',
       'BARBERSHOP', 'TAXICAB', 'CHA HALLWAY / STAIRWELL / ELEVATOR',
       'ABANDONED BUILDING', 'MOVIE HOUSE / THEATER', 'BOWLING ALLEY',
       'APPLIANCE STORE', 'OTHER COMMERCIAL TRANSPORTATION',
       'SCHOOL - PRIVATE BUILDING',
       'AIRPORT BUILDING NON-TERMINAL - SECURE AREA',
       'VEHICLE - OTHER RIDE SHARE SERVICE (LYFT, UBER, ETC.)',
       'MEDICAL / DENTAL OFFICE', 'COIN OPERATED MACHINE',
       'CURRENCY EXCHANGE', 'JAIL / LOCK-UP FACILITY',
       'ATM (AUTOMATIC TELLER MACHINE)', 'AIRCRAFT',
       'OTHER RAILROAD PROPERTY / TRAIN DEPOT', 'FIRE STATION',
       'VACANT LOT', 'AIRPORT BUILDING NON-TERMINAL - NON-SECURE AREA',
       'HOUSE', 'SPORTS ARENA / STADIUM',
       'LAKEFRONT / WATERFRONT / RIVERBANK', 'DRIVEWAY', 'CLEANING STORE',
       'ANIMAL HOSPITAL', 'BRIDGE', 'HIGHWAY / EXPRESSWAY',
       'FACTORY / MANUFACTURING BUILDING', 'VEHICLE - DELIVERY TRUCK',
       'PAWN SHOP', 'PARKING LOT', 'PORCH', 'AUTO',
       'VEHICLE - COMMERCIAL: TROLLEY BUS',
       'COLLEGE / UNIVERSITY - RESIDENCE HALL',
       'AIRPORT TRANSPORTATION SYSTEM (ATS)',
       'AIRPORT VENDING ESTABLISHMENT', 'YARD', 'CREDIT UNION',
       'POOL ROOM', 'FOREST PRESERVE',
       'VEHICLE - COMMERCIAL: ENTERTAINMENT / PARTY BUS', 'CTA PROPERTY',
       'CHA GROUNDS', 'HOSPITAL', 'NEWSSTAND', 'FARM', 'TAVERN',
       'BASEMENT', 'CASINO/GAMBLING ESTABLISHMENT', 'KENNEL',
       'CTA TRACKS - RIGHT OF WAY', 'CHA HALLWAY', 'GANGWAY',
       'BARBER SHOP/BEAUTY SALON', 'RAILROAD PROPERTY', 'OFFICE',
       'HALLWAY', 'STAIRWELL', 'SAVINGS AND LOAN', 'RETAIL STORE',
       'HOTEL', 'CHA STAIRWELL', 'CTA "L" PLATFORM'], dtype=object)
In [48]:
parking_df = df[df['Location Description'] == 'AIRPORT PARKING LOT']
top_parking_types = parking_df['Primary Type'].value_counts().head(10)

plt.figure(figsize=(20,10))
plt.bar(top_parking_types.index, top_parking_types.values, color='orange')
plt.title('Crime Types in Airport Parking Lot')
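plt.xlabel('Primary Type')      # x-axis holds the crime types in a vertical bar chart
plt.ylabel('Number of Crimes')  # bar height is the count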
plt.tight_layout()
plt.show()
[Figure: bar chart of crime types in the airport parking lot]

Insight: Based on the bar plot, motor vehicle theft is the most common crime in the airport parking lot, while robbery is very rare. One possible reason is that vehicles are left unattended while their owners are travelling, but the plot itself does not show when the thefts happen.

15. Crimes by Hour in RESIDENCE.

In [50]:
residence_df = df[df['Location Description'] == 'RESIDENCE'].copy()  # copy so the new Hour column is not added to a slice of df
residence_df['Date'] = pd.to_datetime(residence_df['Date'])
residence_df['Hour'] = residence_df['Date'].dt.hour
hourly_counts = residence_df['Hour'].value_counts().sort_index()

plt.plot(hourly_counts.index, hourly_counts.values, marker='o')
plt.title('Crimes by Hour in Residences')
plt.xlabel('Hour of Day')
plt.ylabel('Number of Crimes')
plt.show()
[Figure: line chart of the number of crimes in residences by hour of day]

Insight: Based on the plot, most crimes in residences happen around 9 AM and 5 PM. That is when people are usually leaving or coming back, so it might be easier for crimes to happen at those times.
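
An hourly analysis depends on the time part of Date surviving the parsing step. The sketch below shows one way to check this, using the raw string format visible in Out[4]; the three sample strings are made up apart from that format.

```python
import pandas as pd

# Sample strings in the same layout as the raw Date column (MM/DD/YYYY hh:mm:ss AM/PM).
raw = pd.Series([
    "04/14/2024 12:00:00 AM",
    "04/14/2024 09:15:00 AM",
    "04/14/2024 05:40:00 PM",
])

# An explicit format avoids ambiguous parsing and keeps the 12-hour clock correct.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p")
hours = parsed.dt.hour

# Share of rows stamped exactly at midnight; a large share can mean the time
# was not recorded, which would inflate hour 0 in an hourly plot.
share_midnight = (hours.eq(0) & parsed.dt.minute.eq(0)).mean()
print(hours.tolist(), round(share_midnight, 2))
```

Running the same midnight check on residence_df before plotting would show how much of hour 0 comes from placeholder times.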

16. What is the maximum number of crimes in a day on the STREET?

In [56]:
# Copy the filtered rows so the Date conversion does not write into a slice of df
street_df = df[df['Location Description'] == 'STREET'].copy()
street_df['Date'] = pd.to_datetime(street_df['Date'])
daily_street = street_df.groupby(street_df['Date'].dt.date).size()
max_street_day = daily_street.idxmax()
max_street_value = daily_street.max()

plt.figure(figsize=(12,5))
daily_street.plot()
plt.axvline(max_street_day, color='red', linestyle='--', label=f'Max: {max_street_day} ({max_street_value})')
plt.title('Daily Crimes on Streets')
plt.xlabel('Date')
plt.ylabel('Number of Crimes')
plt.legend()
plt.tight_layout()
plt.show()
[Figure: line plot "Daily Crimes on Streets", with the peak day marked by a red dashed line]

Insight : Based on the visualization, the most street crimes happened on August 4, 2024, the day marked by the red dashed line drawn with axvline. That day had the highest number of street crimes in the dataset.
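The peak day can also be read from the daily counts directly instead of from the plot. The sketch below uses a small made-up table (the `toy` frame and its dates are assumptions for illustration, not rows from the dataset) to show the same groupby-per-day pattern and how to list the busiest days.

```python
import pandas as pd

# Hypothetical toy data standing in for the STREET subset of the notebook's df.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-08-03 10:00", "2024-08-04 01:15", "2024-08-04 13:30",
        "2024-08-04 22:45", "2024-08-05 09:00", "2024-08-05 18:20",
    ]),
})

# Count incidents per calendar day, dropping the time of day.
daily = toy.groupby(toy["Date"].dt.date).size()

# nlargest() returns the busiest days first, so the top entry is the peak.
top_days = daily.nlargest(3)
peak_day = top_days.index[0]
peak_count = int(top_days.iloc[0])
print(top_days)
```

Listing the top few days instead of only the maximum also shows whether the peak is a clear outlier or only slightly above the next days.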

17. What is the proportion of arrests in SCHOOL - PUBLIC GROUNDS?

In [58]:
df['Location Description'].unique()
Out[58]:
array(['APARTMENT', 'COMMERCIAL / BUSINESS OFFICE', 'STREET', 'RESIDENCE',
       'RESIDENCE - PORCH / HALLWAY', 'RESTAURANT',
       'HOSPITAL BUILDING / GROUNDS', 'ATHLETIC CLUB',
       'PARKING LOT / GARAGE (NON RESIDENTIAL)', 'VEHICLE NON-COMMERCIAL',
       'SIDEWALK', 'OTHER (SPECIFY)', 'SCHOOL - PUBLIC BUILDING',
       'DRIVEWAY - RESIDENTIAL', 'BAR OR TAVERN', 'ALLEY', 'DRUG STORE',
       'SMALL RETAIL STORE', 'RESIDENCE - GARAGE', 'PARK PROPERTY',
       'CONVENIENCE STORE', 'HOTEL / MOTEL', 'SCHOOL - PUBLIC GROUNDS',
       'BOAT / WATERCRAFT', 'CHA PARKING LOT / GROUNDS',
       'POLICE FACILITY / VEHICLE PARKING LOT',
       'AIRPORT TERMINAL UPPER LEVEL - NON-SECURE AREA',
       'AIRPORT PARKING LOT', 'AIRPORT EXTERIOR - NON-SECURE AREA',
       'AIRPORT TERMINAL UPPER LEVEL - SECURE AREA', 'BANK',
       'CTA PARKING LOT / GARAGE / OTHER PROPERTY', 'DEPARTMENT STORE',
       'VACANT LOT / LAND', 'CHURCH / SYNAGOGUE / PLACE OF WORSHIP',
       'NURSING / RETIREMENT HOME', 'GAS STATION',
       'RESIDENCE - YARD (FRONT / BACK)', 'GROCERY FOOD STORE',
       'CTA TRAIN', 'AIRPORT TERMINAL MEZZANINE - NON-SECURE AREA',
       'AIRPORT EXTERIOR - SECURE AREA', 'CEMETARY', 'CTA BUS',
       'CTA STATION', 'CTA PLATFORM', 'CHA APARTMENT',
       'AIRPORT TERMINAL LOWER LEVEL - SECURE AREA',
       'TAVERN / LIQUOR STORE', 'CONSTRUCTION SITE', 'WAREHOUSE',
       'CAR WASH', 'SCHOOL - PRIVATE GROUNDS', nan, 'DAY CARE CENTER',
       'AIRPORT TERMINAL LOWER LEVEL - NON-SECURE AREA',
       'FEDERAL BUILDING', 'GOVERNMENT BUILDING / PROPERTY',
       'VEHICLE - COMMERCIAL', 'COLLEGE / UNIVERSITY - GROUNDS',
       'AUTO / BOAT / RV DEALERSHIP', 'CTA BUS STOP', 'LIBRARY',
       'BARBERSHOP', 'TAXICAB', 'CHA HALLWAY / STAIRWELL / ELEVATOR',
       'ABANDONED BUILDING', 'MOVIE HOUSE / THEATER', 'BOWLING ALLEY',
       'APPLIANCE STORE', 'OTHER COMMERCIAL TRANSPORTATION',
       'SCHOOL - PRIVATE BUILDING',
       'AIRPORT BUILDING NON-TERMINAL - SECURE AREA',
       'VEHICLE - OTHER RIDE SHARE SERVICE (LYFT, UBER, ETC.)',
       'MEDICAL / DENTAL OFFICE', 'COIN OPERATED MACHINE',
       'CURRENCY EXCHANGE', 'JAIL / LOCK-UP FACILITY',
       'ATM (AUTOMATIC TELLER MACHINE)', 'AIRCRAFT',
       'OTHER RAILROAD PROPERTY / TRAIN DEPOT', 'FIRE STATION',
       'VACANT LOT', 'AIRPORT BUILDING NON-TERMINAL - NON-SECURE AREA',
       'HOUSE', 'SPORTS ARENA / STADIUM',
       'LAKEFRONT / WATERFRONT / RIVERBANK', 'DRIVEWAY', 'CLEANING STORE',
       'ANIMAL HOSPITAL', 'BRIDGE', 'HIGHWAY / EXPRESSWAY',
       'FACTORY / MANUFACTURING BUILDING', 'VEHICLE - DELIVERY TRUCK',
       'PAWN SHOP', 'PARKING LOT', 'PORCH', 'AUTO',
       'VEHICLE - COMMERCIAL: TROLLEY BUS',
       'COLLEGE / UNIVERSITY - RESIDENCE HALL',
       'AIRPORT TRANSPORTATION SYSTEM (ATS)',
       'AIRPORT VENDING ESTABLISHMENT', 'YARD', 'CREDIT UNION',
       'POOL ROOM', 'FOREST PRESERVE',
       'VEHICLE - COMMERCIAL: ENTERTAINMENT / PARTY BUS', 'CTA PROPERTY',
       'CHA GROUNDS', 'HOSPITAL', 'NEWSSTAND', 'FARM', 'TAVERN',
       'BASEMENT', 'CASINO/GAMBLING ESTABLISHMENT', 'KENNEL',
       'CTA TRACKS - RIGHT OF WAY', 'CHA HALLWAY', 'GANGWAY',
       'BARBER SHOP/BEAUTY SALON', 'RAILROAD PROPERTY', 'OFFICE',
       'HALLWAY', 'STAIRWELL', 'SAVINGS AND LOAN', 'RETAIL STORE',
       'HOTEL', 'CHA STAIRWELL', 'CTA "L" PLATFORM'], dtype=object)
In [60]:
school_df = df[df['Location Description'] == 'SCHOOL - PUBLIC GROUNDS'].copy()
school_df['Arrest'] = school_df['Arrest'].astype(int)
arrest_prop = school_df['Arrest'].value_counts(normalize=True)

# sort_index() puts 0 (not arrested) before 1 (arrested) to match the labels
plt.bar(['Not Arrested','Arrested'], arrest_prop.sort_index(), color=['#fdae61','#2b83ba'])
plt.title('Proportion of Arrests on Public School Grounds')
plt.ylabel('Proportion')
plt.show()
[Figure: bar chart of the proportion of arrested vs. not arrested incidents on public school grounds]

Insight : The result shows that only a small share of incidents on public school grounds ended in an arrest, while most did not. I think one possible reason is that police avoid making arrests on school grounds so that no student gets hurt if the suspect loses control or fights the police, but the data itself does not show the reason.

18. How many crimes per month in restaurants?

In [63]:
resto_df = df[df['Location Description'] == 'RESTAURANT'].copy()
resto_df['Date'] = pd.to_datetime(resto_df['Date'])
resto_monthly = resto_df.groupby(resto_df['Date'].dt.to_period('M')).size()
mean_resto = resto_monthly.mean()

plt.bar(resto_monthly.index.astype(str), resto_monthly.values, color='green')
# Draw the monthly mean so that plt.legend() has a labeled line to show
plt.axhline(mean_resto, color='red', linestyle='--', label=f'Mean: {mean_resto:.0f}')
plt.title('Monthly Crimes in Restaurants')
plt.xlabel('Month')
plt.ylabel('Number of Crimes')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()
[Figure: bar chart "Monthly Crimes in Restaurants"]

Insight : Based on the bar graph, crimes in restaurants were recorded in every month, but some months had more than others.

19. How many crimes of each type happen in bars or taverns in Chicago?

In [64]:
bar_df = df[df['Location Description'] == 'BAR OR TAVERN']
desc_counts = bar_df['Description'].value_counts().head(7)

plt.pie(desc_counts.values, labels=desc_counts.index, autopct='%1.1f%%', startangle=140)
plt.title('Top Crime Descriptions in Bars/Taverns')
plt.tight_layout()
plt.show()
[Figure: pie chart "Top Crime Descriptions in Bars/Taverns"]

Insight : Based on the pie chart, a large share of the crimes in bars or taverns are simple offenses, and only a small share are pickpocketing. I think pickpocketing is more likely when the victim is drunk, though the data does not record that.

20. Yearly theft crimes in department stores.

In [65]:
dept_df = df[(df['Location Description'] == 'DEPARTMENT STORE') & (df['Primary Type'] == 'THEFT')].copy()
dept_df['Date'] = pd.to_datetime(dept_df['Date'])
dept_df['Year'] = dept_df['Date'].dt.year
yearly_theft = dept_df.groupby('Year').size()

# Use string labels so the x-axis shows whole years instead of fractional ticks
plt.bar(yearly_theft.index.astype(str), yearly_theft.values, color='#b2182b')
plt.title('Yearly THEFT Crimes in Department Stores')
plt.xlabel('Year')
plt.ylabel('Number of THEFT Crimes')
plt.tight_layout()
plt.show()
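[Figure: bar chart "Yearly THEFT Crimes in Department Stores"]

Insight : Based on the bar plot, the 2025 bar is lower than the 2024 bar. The rows shown run from April 14, 2024 to April 12, 2025, so each year is only partly covered (roughly eight and a half months of 2024 and three and a half months of 2025). The lower 2025 count therefore does not by itself mean that theft went down or that stores got better at stopping shoplifters; the two years need to be compared over equal periods first.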
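One way to compare partly covered years is to divide each year's count by the number of days the data actually covers in that year. This is only a sketch: the dates below are made up, and the names `incidents`, `days_covered` and `per_day` are assumptions for illustration. In the notebook the dates would come from `dept_df['Date']`.

```python
import pandas as pd

# Hypothetical incident dates; the real data would come from dept_df['Date'].
dates = pd.to_datetime([
    "2024-04-14", "2024-06-01", "2024-09-15", "2024-12-30",
    "2025-01-10", "2025-04-12",
])
incidents = pd.DataFrame({"Date": dates})
incidents["Year"] = incidents["Date"].dt.year

# Raw counts per year (what a plain bar chart shows).
counts = incidents.groupby("Year").size()

# Days of each year covered by the data, from the first to the last date seen.
span = incidents.groupby("Year")["Date"].agg(["min", "max"])
days_covered = (span["max"] - span["min"]).dt.days + 1

# Incidents per covered day, which is comparable across partial years.
per_day = counts / days_covered
print(pd.DataFrame({"count": counts, "days": days_covered, "per_day": per_day}))
```

Using the first and last observed date per year is a rough estimate of coverage; if the true collection window is known, it would be better to use that instead.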

21. How many theft crimes over time?

In [67]:
theft_df = df[df['Primary Type'] == 'THEFT'].copy()
theft_df['Date'] = pd.to_datetime(theft_df['Date'])
monthly_theft = theft_df.groupby(theft_df['Date'].dt.to_period('M')).size()

plt.figure(figsize=(10,5))
monthly_theft.plot(marker='o', color='red')
plt.title('Monthly THEFT Crimes')
plt.xlabel('Month')
plt.ylabel('Number of THEFT Crimes')
plt.tight_layout()
plt.show()
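[Figure: line plot "Monthly THEFT Crimes"]

Insight : Based on the plot, theft crimes went up around June and July 2024, then slowly went down toward April 2025. The first and last months are only partly covered by the data (the rows start on April 14, 2024 and end on April 12, 2025), so those two points sit lower than a full month would.

22. Which types of battery crimes happen most often, and are they usually domestic or more violent?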

In [73]:
battery_df = df[df['Primary Type'] == 'BATTERY']
top_battery_desc = battery_df['Description'].value_counts().head(5)

plt.barh(top_battery_desc.index, top_battery_desc.values, color='purple')
plt.title('BATTERY Crime')
plt.xlabel('Number of Crimes')
plt.tight_layout()
plt.show()
[Figure: horizontal bar chart of the top five BATTERY descriptions]

Insight : Based on the visualization, most battery crimes are domestic and simple. They usually happen between people who know each other, like family or partners. More serious battery crimes, like those involving weapons, happen less often.

23. How many people get arrested for narcotics crimes?

In [69]:
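narcotics_df = df[df['Primary Type'] == 'NARCOTICS'].copy()
narcotics_df['Arrest'] = narcotics_df['Arrest'].astype(int)
# value_counts() orders by frequency, so pin the order to [0, 1] to match the pie labels
arrest_prop = narcotics_df['Arrest'].value_counts(normalize=True).reindex([0, 1], fill_value=0)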

plt.pie(arrest_prop, labels=['Not Arrested','Arrested'], autopct='%1.1f%%', colors=['#fdae61','#2b83ba'])
plt.title('Arrest Proportion for NARCOTICS Crimes')
plt.show()
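[Figure: pie chart "Arrest Proportion for NARCOTICS Crimes"]

Insight : The pie chart shows a 95.8% slice and a 4.2% slice. Reading the 95.8% slice as "Not Arrested" is only right if the slices are in the order 0 (not arrested), 1 (arrested); value_counts() on its own returns the most frequent value first. If "not arrested" really is the more common outcome, then only a few narcotics crimes lead to an arrest. If "arrested" is more common, the two labels are swapped and 95.8% of narcotics crimes ended in an arrest. The order should be checked with arrest_prop.sort_index() before drawing a conclusion.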
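The label-order issue can be seen on a tiny example. This is a sketch with a made-up `arrest` column (an assumption for illustration, not the notebook's data) in which arrests are the more common outcome, so the frequency order and the label order disagree.

```python
import pandas as pd

# Hypothetical toy column: 1 = arrested, 0 = not arrested (4 of 5 arrested).
arrest = pd.Series([1, 1, 1, 0, 1], name="Arrest")

# value_counts() sorts by frequency, so the most common value comes first.
by_frequency = arrest.value_counts(normalize=True)

# Reindexing to [0, 1] pins the order to the labels, whatever the frequencies.
labels = ["Not Arrested", "Arrested"]
by_label = by_frequency.reindex([0, 1], fill_value=0.0)
shares = dict(zip(labels, by_label.tolist()))

print(list(by_frequency.index))  # order follows frequency, not the labels
print(shares)
```

Passing `by_frequency` with the fixed labels would attach "Not Arrested" to the larger slice here, which is the opposite of what the data says; `by_label` keeps slices and labels aligned.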

24.What kind of things get damaged the most in criminal cases?

In [72]:
damage_df = df[df['Primary Type'] == 'CRIMINAL DAMAGE']
top_damage_desc = damage_df['Description'].value_counts().head(7)

plt.bar(top_damage_desc.index, top_damage_desc.values, color='orange')
plt.title('CRIMINAL DAMAGE')
plt.ylabel('Number of Crimes')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
[Figure: bar chart of the top seven CRIMINAL DAMAGE descriptions]

Insight : Based on the bar plot, damage to vehicles is the most common type of criminal damage, and damage to property is second. I think some of the vehicle damage could come from car chases where the car crashes, and the property damage could be connected when a vehicle hits a property and both get damaged, but the chart only shows counts and cannot confirm that.

25. Is there a correlation between vehicle and property damage in criminal damage cases?

In [75]:
damage_df = df[df['Primary Type'] == 'CRIMINAL DAMAGE'].copy()
damage_df['Date'] = pd.to_datetime(damage_df['Date'])
damage_df['Day'] = damage_df['Date'].dt.date
daily_damage = damage_df.groupby(['Day', 'Description']).size().unstack(fill_value=0)
selected = daily_damage[['TO PROPERTY', 'TO VEHICLE']]
correlation = selected.corr()

sns.heatmap(correlation, annot=True, cmap='coolwarm')
plt.title('Correlation between Property and Vehicle Damage')
plt.show()
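[Figure: heatmap "Correlation between Property and Vehicle Damage"]

Insight : Based on the result, the daily counts of TO PROPERTY and TO VEHICLE damage are correlated. As I said in number 24, one possible link is a vehicle crashing into a property so that both get damaged, but a correlation between daily counts does not show that the same incidents caused both.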
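To make clear what the heatmap number measures, the sketch below computes the Pearson correlation of two daily count columns in two ways: with pandas and from the definition. The counts are made up for illustration (an assumption); only the column names mirror the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical daily counts; the two columns mirror the notebook's descriptions.
daily = pd.DataFrame({
    "TO PROPERTY": [10, 12, 15, 11, 18, 20],
    "TO VEHICLE":  [14, 15, 19, 13, 22, 25],
})

# Pearson correlation via pandas (the default method of DataFrame.corr()).
r_pandas = daily["TO PROPERTY"].corr(daily["TO VEHICLE"])

# The same number from the definition: co-deviation over the product of spreads.
x = daily["TO PROPERTY"].to_numpy(dtype=float)
y = daily["TO VEHICLE"].to_numpy(dtype=float)
r_manual = ((x - x.mean()) * (y - y.mean())).sum() / np.sqrt(
    ((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum()
)
print(round(r_pandas, 3), round(r_manual, 3))
```

The value only says that the two daily totals rise and fall together. Shared drivers such as weekends or weather could produce the same pattern without any single incident involving both a vehicle and a property.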

26. How many assault crimes are there monthly in Chicago?

In [76]:
assault_df = df[df['Primary Type'] == 'ASSAULT'].copy()
assault_df['Date'] = pd.to_datetime(assault_df['Date'])
monthly_assault = assault_df.groupby(assault_df['Date'].dt.to_period('M')).size()

plt.plot(monthly_assault.index.astype(str), monthly_assault.values, marker='o', color='green')
plt.title('Monthly ASSAULT Crimes')
plt.xlabel('Month')
plt.ylabel('Number of ASSAULT Crimes')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
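[Figure: line plot "Monthly ASSAULT Crimes"]

Insight : Based on the plot, assault crimes changed from month to month, rising, falling, and rising again, but they dropped a lot at the end of the period. The dataset only runs from April 2024 to April 2025 and its final month is incomplete, so part of that drop probably reflects missing days rather than fewer assaults.

27. How many robbery crimes happen daily in Chicago?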

In [78]:
robbery_df = df[df['Primary Type'] == 'ROBBERY'].copy()
robbery_df['Date'] = pd.to_datetime(robbery_df['Date'])
daily_robbery = robbery_df.groupby(robbery_df['Date'].dt.date).size()
max_robbery_day = daily_robbery.idxmax()
max_robbery_value = daily_robbery.max()

plt.figure(figsize=(12,5))
daily_robbery.plot()
# Mark the peak day so that plt.legend() has a labeled line to show
plt.axvline(max_robbery_day, color='red', linestyle='--', label=f'Max: {max_robbery_day} ({max_robbery_value})')
plt.title('Daily ROBBERY Crimes')
plt.xlabel('Date')
plt.ylabel('Number of ROBBERY Crimes')
plt.legend()
plt.tight_layout()
plt.show()
[Figure: line plot "Daily ROBBERY Crimes"]
In [79]:
robbery_df = df[df['Primary Type'] == 'ROBBERY']
robbery_df
Out[79]:
ID Case Number Date Block IUCR Primary Type Description Location Description Arrest Domestic ... FBI Code X Coordinate Y Coordinate Year Updated On Latitude Longitude Location YearMonth Weekday
37 13428473 JH224225 2024-04-14 00:05:00 008XX S MORGAN ST 0340 ROBBERY ATTEMPT STRONG ARM - NO WEAPON STREET 1 0 ... 03 1170047.0 1896351.0 2024 12/21/2024 03:40:46 PM 41.871083 -87.651163 (41.871083204, -87.651162509) 2024-04 Sunday
57 13428954 JH224898 2024-04-14 00:30:00 047XX N KILPATRICK AVE 0320 ROBBERY STRONG ARM - NO WEAPON SIDEWALK 0 0 ... 03 1144180.0 1930999.0 2024 12/21/2024 03:40:46 PM 41.966686 -87.745258 (41.966685706, -87.745258076) 2024-04 Sunday
95 13428564 JH224312 2024-04-14 01:25:00 004XX W 24TH ST 0320 ROBBERY STRONG ARM - NO WEAPON COMMERCIAL / BUSINESS OFFICE 0 1 ... 03 1173417.0 1888352.0 2024 12/21/2024 03:40:46 PM 41.849059 -87.639028 (41.849059202, -87.639027598) 2024-04 Sunday
112 13428508 JH224290 2024-04-14 01:52:00 001XX N LAPORTE AVE 0320 ROBBERY STRONG ARM - NO WEAPON VEHICLE NON-COMMERCIAL 0 0 ... 03 1143351.0 1900736.0 2024 12/21/2024 03:40:46 PM 41.883657 -87.749064 (41.883656549, -87.749064412) 2024-04 Sunday
148 13428659 JH224447 2024-04-14 02:33:00 0000X W DIVISION ST 0320 ROBBERY STRONG ARM - NO WEAPON STREET 0 0 ... 03 1175906.0 1908369.0 2024 12/21/2024 03:40:46 PM 41.903931 -87.629290 (41.903931469, -87.629290187) 2024-04 Sunday
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
248924 13806459 JJ218927 2025-04-11 17:30:00 078XX S HERMITAGE AVE 031A ROBBERY ARMED - HANDGUN ALLEY 0 0 ... 03 1166043.0 1852683.0 2025 04/19/2025 03:41:24 PM 41.751340 -87.667105 (41.751339673, -87.667105463) 2025-04 Friday
249014 13803020 JJ214803 2025-04-11 20:30:00 047XX N LAWNDALE AVE 0312 ROBBERY ARMED - KNIFE / CUTTING INSTRUMENT ALLEY 0 0 ... 03 1150866.0 1931287.0 2025 04/19/2025 03:41:24 PM 41.967347 -87.720667 (41.967347439, -87.720666884) 2025-04 Friday
249042 13803577 JJ214845 2025-04-11 21:40:00 001XX W LAKE ST 0320 ROBBERY STRONG ARM - NO WEAPON CTA TRAIN 0 0 ... 03 1175334.0 1901735.0 2025 04/19/2025 03:41:24 PM 41.885740 -87.631591 (41.885740288, -87.631590568) 2025-04 Friday
249081 13803748 JJ215553 2025-04-11 23:00:00 019XX W MADISON ST 0320 ROBBERY STRONG ARM - NO WEAPON SIDEWALK 0 0 ... 03 1163655.0 1900042.0 2025 04/19/2025 03:41:24 PM 41.881349 -87.674526 (41.881348623, -87.674525788) 2025-04 Friday
249099 13803128 JJ214903 2025-04-11 23:39:00 035XX N ORIOLE AVE 0334 ROBBERY ATTEMPT ARMED - KNIFE / CUTTING INSTRUMENT SIDEWALK 0 0 ... 03 1124794.0 1922510.0 2025 04/19/2025 03:41:24 PM 41.943734 -87.816727 (41.943733887, -87.816727097) 2025-04 Friday

8200 rows × 24 columns

Insight : Based on the plot, even though it is messy, robbery crimes change a lot from day to day, and some days show big spikes. It looks almost alternating, with many robberies one day and few the next.

28. How many people got their vehicle stolen in Chicago?

In [80]:
mvt_df = df[df['Primary Type'] == 'MOTOR VEHICLE THEFT']
top_mvt_desc = mvt_df['Description'].value_counts().head(6)

plt.barh(top_mvt_desc.index, top_mvt_desc.values, color='teal')
plt.title('Top MOTOR VEHICLE THEFT Descriptions')
plt.xlabel('Number of Crimes')
plt.tight_layout()
plt.show()
[Figure: horizontal bar chart "Top MOTOR VEHICLE THEFT Descriptions"]

Insight : Based on the bar plot, most motor vehicle thefts involve automobiles, and only a few involve buses and trucks.

29. What is the arrest rate for theft in Chicago?

In [81]:
theft_df = df[df['Primary Type'] == 'THEFT']
top_theft_desc = theft_df['Description'].value_counts().head(3).index
# Arrest rate = mean of the 0/1 Arrest flag within each description
arrest_rates = []
for desc in top_theft_desc:
    sub_df = theft_df[theft_df['Description'] == desc]
    arrest_rates.append(sub_df['Arrest'].astype(int).mean())

plt.bar(top_theft_desc, arrest_rates, color='navy')
plt.title('Arrest Rate for Top 3 THEFT Descriptions')
plt.ylabel('Arrest Rate')
plt.ylim(0,1)
plt.tight_layout()
plt.show()
[Figure: bar chart "Arrest Rate for Top 3 THEFT Descriptions"]

Insight : Based on the bar plot, retail theft has the highest arrest rate of the three most common theft types, much higher than thefts over or under $500. That means shoplifting cases are more likely to lead to arrests than other kinds of stealing. One likely reason is that stores have security guards, so they catch more thieves than the streets do.

30. Battery crimes per weekday in Chicago.

In [83]:
battery_df = df[df['Primary Type'] == 'BATTERY'].copy()
battery_df['Date'] = pd.to_datetime(battery_df['Date'])
battery_df['Weekday'] = battery_df['Date'].dt.day_name()
weekday_counts = battery_df['Weekday'].value_counts().reindex(
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']
)

plt.bar(weekday_counts.index, weekday_counts.values, color='crimson')
plt.title('BATTERY Crimes per Weekday')
plt.ylabel('Number of Crimes')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
[Figure: bar chart "BATTERY Crimes per Weekday"]

Insight : Based on the bar plot, battery crimes are highest on Saturday and Sunday, which suggests that these crimes happen more on weekends. The numbers from Monday to Friday are still high; Saturday and Sunday are just higher.

31. Correlation between Arrest and Domestic for criminal trespass.

In [84]:
trespass_df = df[df['Primary Type'] == 'CRIMINAL TRESPASS'].copy()
trespass_df['Arrest'] = trespass_df['Arrest'].astype(int)
trespass_df['Domestic'] = trespass_df['Domestic'].astype(int)
corr = trespass_df[['Arrest', 'Domestic']].corr().iloc[0,1]

plt.figure(figsize=(4,4))
sns.heatmap(trespass_df[['Arrest', 'Domestic']].corr(), annot=True, cmap='coolwarm')
plt.title(f'Correlation: Arrest vs Domestic (TRESPASS) ({corr:.2f})')
plt.show()
[Figure: heatmap "Correlation: Arrest vs Domestic (TRESPASS)"]

Insight : Based on the heatmap, whether a case is domestic doesn't really affect arrests for criminal trespass, because the two are not correlated.
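Because Arrest and Domestic are both 0/1 flags, the correlation can be read alongside a plain table of arrest rates per group. The sketch below uses made-up rows (the `toy` frame is an assumption for illustration) to show both views with `pd.crosstab` and `Series.corr`.

```python
import pandas as pd

# Hypothetical toy rows; 1/0 flags as in the notebook's Arrest and Domestic columns.
toy = pd.DataFrame({
    "Arrest":   [1, 0, 0, 1, 0, 1, 0, 0],
    "Domestic": [0, 0, 1, 0, 1, 1, 0, 0],
})

# Share of arrested / not arrested within each Domestic group (rows sum to 1).
rates = pd.crosstab(toy["Domestic"], toy["Arrest"], normalize="index")

# Pearson correlation of two 0/1 columns (also called the phi coefficient).
phi = toy["Arrest"].corr(toy["Domestic"])
print(rates)
print(round(phi, 3))
```

A correlation near zero corresponds to arrest rates that are close to each other in the two rows of the table, which is often easier to explain than the coefficient alone.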

32.Which areas in Chicago experience the most crime?

In [88]:
import folium
from folium.plugins import HeatMap
In [ ]:
df_map = df.dropna(subset=['Latitude', 'Longitude'])
m = folium.Map(location=[41.8781, -87.6298], zoom_start=11)
HeatMap(data=df_map[['Latitude', 'Longitude']].values, radius=8, blur=12, max_zoom=1).add_to(m)

# Leaflet.heat's default gradient runs blue -> cyan -> lime -> yellow -> red,
# so red marks the densest areas and blue the sparsest.
legend_html = '''
 <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 180px; height: 90px; 
     background-color: white; z-index:9999; font-size:14px;
     border:2px solid grey; border-radius:8px; padding: 10px;">
     <b>Crime Density Legend</b><br>
     <i style="background: red; width: 20px; height: 10px; display: inline-block;"></i> High<br>
     <i style="background: yellow; width: 20px; height: 10px; display: inline-block;"></i> Medium<br>
     <i style="background: blue; width: 20px; height: 10px; display: inline-block;"></i> Low
 </div>
'''
m.get_root().html.add_child(folium.Element(legend_html))
m
# I improved this code using AI so the viewer can tell what each color means
Out[ ]:
[Interactive folium heat map of crime density in Chicago; not rendered in this export]

Insight : Based on the heat map, the red areas have the highest density of recorded crimes, the yellow and green areas are in the middle, and the blue areas have the lowest, following the default color gradient of the folium heat map. Crimes are recorded across most of Chicago.
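The heat map shows where crimes cluster but does not name the areas. A numeric ranking can complement it, for example by counting rows per Community Area, a column that appears in the notebook's df. The values below are made up for illustration (an assumption), and `area_counts` is a hypothetical name.

```python
import pandas as pd

# Hypothetical Community Area codes, stored as floats with a missing value,
# the way the column appears in the notebook's DataFrame display.
toy = pd.DataFrame({"Community Area": [25.0, 8.0, 25.0, 32.0, 25.0, 8.0, None]})

# Drop missing codes, convert to whole numbers, and rank areas by incident count.
area_counts = (
    toy["Community Area"]
    .dropna()
    .astype(int)
    .value_counts()
    .head(3)
)
print(area_counts)
```

On the real data the same three steps would give the community areas with the most recorded crimes, which can then be compared with the hot spots on the map.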

33. What location has the highest arrest rate in Chicago?

In [105]:
top_locs = df['Location Description'].value_counts().head(5).index
arrest_rates = df[df['Location Description'].isin(top_locs)].groupby('Location Description')['Arrest'].mean().reindex(top_locs)

plt.figure(figsize=(10,5))
plt.bar(arrest_rates.index, arrest_rates.values, color='royalblue')
plt.title('Arrest Rate for Locations')
plt.ylabel('Arrest Rate')
plt.ylim(0,1)
plt.tight_layout()
plt.show()
[Figure: bar chart titled 'Arrest Rate for Locations', y-axis 'Arrest Rate']

Insight : Based on the bar graph, the sidewalk has the highest arrest rate among the five most frequently reported locations in Chicago. One possible explanation is that suspects are often chased and arrested on sidewalks, and that some crimes in Chicago take place there.
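
The chart above only compares the five most frequently reported locations, so a less common location could still have a higher arrest rate. A minimal sketch of how all locations might be ranked while ignoring those with very few reports is shown here. The helper name `arrest_rate_by_location`, the `min_reports` threshold, and the toy data are assumptions made for illustration, not part of the original analysis.

```python
import pandas as pd

def arrest_rate_by_location(frame, min_reports=1):
    """Rank every location by arrest rate, keeping only those with enough reports."""
    stats = (
        frame.groupby('Location Description')['Arrest']
             .agg(reports='count', arrest_rate='mean')
    )
    # Drop rare locations whose rate would rest on only a handful of cases
    stats = stats[stats['reports'] >= min_reports]
    return stats.sort_values('arrest_rate', ascending=False)

# Hypothetical toy data standing in for the Chicago crimes DataFrame
toy = pd.DataFrame({
    'Location Description': ['STREET', 'STREET', 'STREET', 'STREET',
                             'SIDEWALK', 'SIDEWALK', 'SIDEWALK', 'ALLEY'],
    'Arrest': [0, 0, 0, 1, 1, 1, 0, 1],
})

ranked = arrest_rate_by_location(toy, min_reports=2)
print(ranked)
```

On the real data, the same call could be made as `arrest_rate_by_location(df, min_reports=1000)`, with the threshold chosen to suit the dataset.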

34. Arrest Rate by Description for a Specific Primary Type.

In [108]:
primary_type = 'WEAPONS VIOLATION'  # Change to any type you want
subset = df[df['Primary Type'] == primary_type]
arrest_rates = subset.groupby('Description')['Arrest'].mean().sort_values(ascending=False)

plt.figure(figsize=(10,15))
arrest_rates.plot(kind='bar', color='firebrick')
plt.title(f'Arrest Rate by Description for {primary_type}')
plt.ylabel('Arrest Rate')
plt.ylim(0,1)
plt.tight_layout()
plt.show()
[Figure: bar chart titled 'Arrest Rate by Description for WEAPONS VIOLATION', y-axis 'Arrest Rate']

Insight : Based on the bar graph, arrests are made most often for weapons violations in which the suspect possesses a firearm, and least often for reckless firearm discharges.

35. What is the correlation between Arrest and Domestic in Battery crimes in Chicago?

In [110]:
battery_df = df[df['Primary Type'] == 'BATTERY'].copy()  # copy so the assignments below do not write to a slice of df
battery_df['Arrest'] = battery_df['Arrest'].astype(int)
battery_df['Domestic'] = battery_df['Domestic'].astype(int)
corr = battery_df[['Arrest', 'Domestic']].corr().iloc[0,1]

plt.figure(figsize=(4,4))
sns.heatmap(battery_df[['Arrest', 'Domestic']].corr(), annot=True, cmap='coolwarm')
plt.title(f'Correlation: Arrest vs Domestic (BATTERY) ({corr:.2f})')
plt.show()
[Figure: heatmap of the Arrest vs Domestic correlation matrix for BATTERY]

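Insight : Based on the result, there is a correlation between Arrest and Domestic in battery cases, but it is very small. Whether a battery case is domestic therefore has little linear association with whether an arrest is made in Chicago. Note that a correlation by itself does not show cause and effect.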
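
Because both Arrest and Domestic are 0/1 columns, their Pearson correlation is the same quantity as the phi coefficient of the 2x2 contingency table. A small sketch of that check follows. It also compares the arrest rate for domestic and non-domestic cases, which may be easier to interpret than a single correlation value. The toy data and variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for battery_df[['Arrest', 'Domestic']]
toy = pd.DataFrame({
    'Arrest':   [1, 1, 0, 0, 0, 1, 0, 0],
    'Domestic': [1, 0, 1, 0, 0, 1, 1, 0],
})

# 2x2 contingency table: rows are Arrest, columns are Domestic
table = pd.crosstab(toy['Arrest'], toy['Domestic'])
n00, n01 = table.loc[0, 0], table.loc[0, 1]
n10, n11 = table.loc[1, 0], table.loc[1, 1]

# Phi coefficient computed from the four cell counts
phi = (n11 * n00 - n10 * n01) / np.sqrt(
    (n10 + n11) * (n00 + n01) * (n01 + n11) * (n00 + n10)
)

# Pearson correlation of the two binary columns, as used in the cell above
pearson = toy['Arrest'].corr(toy['Domestic'])

# Arrest rate for non-domestic (0) and domestic (1) cases
rate_by_domestic = toy.groupby('Domestic')['Arrest'].mean()

print(f'phi = {phi:.4f}, pearson = {pearson:.4f}')
print(rate_by_domestic)
```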